HEPP, A NOVEL GENE WITH A ROLE IN HEMATOPOIETIC AND NEURAL DEVELOPMENT

[0001] This application claims priority to U.S. application no. 60/268,923, filed February 16, 
2001, which is incorporated herein by reference. 

FIELD OF THE INVENTION 

[0002] The invention relates to a novel conserved gene and protein product, designated Hepp 
(hematopoietic progenitor protein), that has a role in mammalian hematopoiesis and neural 
function. 

BACKGROUND INFORMATION 

[0003] The life-long maintenance and regenerative capacity of the hematopoietic system 
depends upon self-renewal and differentiation of pluripotent hematopoietic stem cells (HSC). 
The HSC give rise to all mature blood cell types by differentiating through intermediate 
progenitor cells that undergo lineage commitment and subsequent development along a single 
pathway (1-5). During the last two decades a highly complex regulatory network of molecular 
mechanisms necessary to control lineage commitment and differentiation of blood cells has been 
identified, including growth factors and receptors, cell-cell interaction molecules, signal 
transduction molecules and transcription factors (6-15). Due to distinct functional features of 
HSC, progenitors and mature blood cells, it is reasonable to assume that these properties are 
regulated at least in part by molecules that are preferentially expressed at particular stages of 
blood cell development. One approach to identify molecules that are important for self-renewal 
and lineage commitment of HSC and progenitors is to focus on rare populations of cells that are 
enriched for HSC and progenitors. Construction of HSC and progenitor cell-specific subtracted 
cDNA libraries, coupled with cDNA sequencing and microarray-based studies of gene 
expression patterns, will be necessary to molecularly define self-renewal, functional pluripotency 
and lineage commitment of HSC and progenitors and to elucidate the extraordinary 
developmental plasticity of HSC (16-19). Using subtracted cDNA libraries and a cDNA 
microarray approach, Phillips et al. (17) have recently reported results of a genome-wide gene 
expression analysis in mouse fetal liver HSC and progenitors. The complete data in the form of a 
database represent the first step in elucidating the molecular phenotype of hematopoietic stem 
cells and progenitors. 

[0004] Elucidation of the differential gene expression during differentiation of hematopoietic 
stem cell and progenitors should have far reaching implications for ex vivo manipulation of 
HSC, clinical bone marrow transplantation and gene therapy of hematological disorders. 

[0005] To identify novel molecules involved in intrinsic regulation of HSC and progenitor cell 
lineage commitment and differentiation, we have generated full-length and subtracted cDNA 
libraries from mouse adult bone marrow cell populations enriched for HSC (Lin⁻Sca-1⁺ cells) 
and progenitors (Lin⁻Sca-1⁻ cells) (19). The phenotypically and functionally defined population of 
primitive Lin⁻Sca-1⁺ cells comprises 0.1-0.2% of bone marrow cells and contains virtually all 
HSC and primitive progenitors, whereas more differentiated Lin⁻Sca-1⁻ cells contain committed 
progenitors but lack HSC. Here we describe cloning and characterization of a novel gene, 
Hepp, that is expressed preferentially in mouse fetal and adult hematopoietic progenitors and 
mature blood cells. 

[0006] Certain aspects of the present invention have been disclosed in Abdullah et al. (19). 

SUMMARY OF THE INVENTION 

[0007] Through differential screening of mouse hematopoietic stem cell (HSC) and progenitor 
subtracted cDNA libraries, we have identified a progenitor cell-specific transcript that 
represents a novel conserved gene, designated Hepp (hematopoietic progenitor protein). Mouse 
and human Hepp genes encode proteins of 237 and 241 amino acids with no detectable known 
functional domains or motifs. The mouse gene and corresponding protein are set forth in SEQ 
ID NO:1 and SEQ ID NO:2, respectively. The human gene and corresponding protein are set 
forth in SEQ ID NO:3 and SEQ ID NO:4, respectively. During embryonic hematopoiesis Hepp 
is not expressed in mouse fetal liver HSC (Sca-1⁺c-kit⁺AA4.1⁺Lin⁻ cells), but is abundantly 
transcribed in populations of hematopoietic progenitors (AA4.1⁻). In adult mice, Hepp is not 
transcribed in highly enriched populations of bone marrow HSC (Rh-123(low) Sca-1⁺c-kit⁺Lin⁻ 
cells), but its expression is upregulated as a more heterogeneous population of bone marrow HSC 
(Lin⁻Sca-1⁺ cells) differentiates into progenitors (Lin⁻Sca-1⁻ cells) and more mature lymphoid 
and myeloid cell types. The human gene was localized to chromosome 14q32, a region with 
frequent chromosome aberrations associated with multiple cases of acute myeloid leukemia, 
chronic lymphoproliferative disorder, acute lymphoblastic leukemia, non-Hodgkin's lymphoma, 
and myelodysplastic syndrome, for which the genes involved are unknown. Evolutionary 
conservation and differential expression in fetal and adult HSC and progenitors suggest that 
the Hepp gene could play an important role in HSC/progenitor cell lineage commitment and 
differentiation, and could be involved in the etiology of hematological malignancies. 

[0008] The gene and associated protein should be useful in a variety of contexts, for example, as 
reagents for differential identification of the tissue(s) or cell type(s) present in a biological 
sample and for diagnosis of diseases and conditions which include, but are not limited to, neural 
disorders and malignancies, particularly malignancies of the blood. Polypeptides of the invention 
and antibodies directed to these polypeptides are expected to be useful in providing 
immunological probes for differential identification of the tissue(s) or cell type(s), by means 
familiar to persons of skill in the art. For a number of disorders of neural and hematological 
tissues or cells, particularly of the nervous system and blood, expression of the Hepp gene at 
significantly higher or lower levels may be routinely detected in certain tissues or cell types 
relative to the standard gene expression level, i.e., the expression level in healthy tissue or bodily 
fluid from an individual not having the disorder. 

[0009] The tissue distribution of Hepp gene expression, and the characteristics of the Hepp 
knock-out mouse described herein, indicate that polynucleotides and polypeptides corresponding 
to this gene are useful for the detection, treatment, and/or prevention of neurodegenerative 
disease states, such as, for example, amyotrophic lateral sclerosis, and hematological disorders, 
particularly neoplasms of the blood such as acute myelomonocytic leukemia, lymphoblastic 
lymphoma, chronic lymphocytic leukemia, acute lymphoblastic leukemia, multiple myeloma, B- 
prolymphocytic leukemia, plasma cell leukemia, adult T-cell lymphoma/leukemia, diffuse large 
B-cell lymphoma, nodal marginal zone B-cell lymphoma, Burkitt's lymphoma, follicular 
lymphoma, hairy cell leukemia, mantle cell lymphoma, splenic marginal zone B-cell lymphoma, 
and T-prolymphocytic leukemia. 

[0010] The terms "nucleic acid", "oligonucleotide", and "polynucleotide" are intended to include 
RNA, DNA, or RNA/DNA hybrid sequences of more than one nucleotide in either single chain 
or duplex form, and are used interchangeably herein. 

[0011] The terms "peptide", "polypeptide" and "protein", as used herein, refer to a sequence of 
naturally occurring amino acids, more particularly to a translated amino acid sequence generated 
from a polynucleotide of the invention. The proteins of the invention may in some instances 
have undergone post-translational modification. In general, "peptide" refers to a sequence of fewer 
than 10 residues, "polypeptide" refers to a sequence of 10 or more amino acid residues and as 
used herein is intended to encompass proteins as well. 

[0012] The terms "complementary" or "complement thereof", as used herein, refer to sequences 
of polynucleotides which are capable of forming Watson & Crick base pairing with another 
specified polynucleotide throughout the entirety of the complementary region. This term is 
applied to pairs of polynucleotides based solely upon their sequences and does not refer to any 
specific conditions under which the two polynucleotides would actually bind. 
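
As an illustration of the sequence-only definition above, the following minimal Python sketch tests whether two polynucleotides are complementary across their full length. The function names are hypothetical and chosen only for this example; strands are assumed to be written 5' to 3' and to pair in antiparallel orientation.

```python
# Illustrative sketch only; function names are hypothetical, not part of the disclosure.
# Strands are assumed to be written 5'->3' and to pair antiparallel.
COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA or RNA sequence, expressed as DNA."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_complementary(a: str, b: str) -> bool:
    """True if b base-pairs with a throughout the entire region.

    The check uses sequence alone and says nothing about hybridization conditions.
    """
    return reverse_complement(a) == b.upper().replace("U", "T")
```

A call such as `is_complementary("ATGC", "GCAT")` compares the two strands base by base; a single mismatch anywhere in the region makes the result false.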

[0013] For the purposes of this invention, when referring to nucleic acid and polypeptide 
sequences, percent similarity and percent identity are calculated according to the methods of 
CLUSTAL W (32). 
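
For orientation, percent identity and percent similarity over an existing pairwise alignment can be sketched as below. This is a simplified illustration, not the CLUSTAL W procedure itself: the conservative-substitution groups and the choice of denominator (columns in which neither sequence has a gap) are assumptions made for this example.

```python
# Simplified sketch; not the CLUSTAL W scoring scheme.
# Assumption: these conservative-substitution groups are illustrative only.
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "ST", "NQ"]


def _similar(a: str, b: str) -> bool:
    """Residues count as similar if identical or in the same assumed group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def percent_identity_similarity(aln_a: str, aln_b: str) -> tuple:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Gaps are written as '-'. Assumption: only columns where both sequences
    carry a residue are counted in the denominator.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aln_a.upper(), aln_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0, 0.0
    identical = sum(x == y for x, y in pairs)
    similar = sum(_similar(x, y) for x, y in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)
```

Other denominators (for example, the length of the shorter sequence) give slightly different percentages, so reported values should always state the convention used.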

[0014] As used herein, the term "isolated" refers to material removed from its original 
environment (e.g., for naturally occurring substances, removed from their natural environment). 
Such material could be part of a vector or a composition of matter, or could be contained within 
a cell, if said vector, composition or cell is not the original environment of the material. 

[0015] As used herein, the term "transgenic animal" is an animal containing a defined 
change to its germ line, wherein the change is not ordinarily found in the wild-type animal and 
can be passed on to the animal's progeny. The change to the animal's germ line can be an 
insertion, a substitution, or a deletion. In a broad sense, the term "transgenic" encompasses 
organisms where a gene has been eliminated or disrupted so as to result in the elimination of a 
phenotype associated with the disrupted gene ("knock-out (KO) animals"). The term "transgenic" 
also encompasses organisms containing modifications to their existing genes and organisms 
modified to contain exogenous genes introduced into their germ line. 

[0016] It is one object of the invention to provide an isolated nucleic acid comprising a 
sequence that is at least 70% identical to SEQ ID NO:1 or SEQ ID NO:3, or a sequence that is 
complementary thereto. Preferably the sequence is at least 77% identical, more preferably 80% 
identical, and even more preferably 85%, 90%, 95%, 98%, 99% or 100% identical to SEQ ID 
NO:1 or SEQ ID NO:3. 

[0017] The invention also provides an isolated nucleic acid comprising a sequence that is at 
least 70% identical to a fragment of SEQ ID NO:1 or SEQ ID NO:3, the fragment representing at 
least 50 contiguous bases, preferably 100 contiguous bases and most preferably 150 contiguous 
bases of SEQ ID NO:1 or SEQ ID NO:3. Preferably the sequence is at least 77% identical, more 
preferably 80% identical, and even more preferably 85%, 90%, 95%, or 100% identical to said 
contiguous bases. 

[0018] The nucleic acids of the invention may be produced recombinantly, synthetically, or by 
any means available to those of skill in the art, and may be cloned using techniques known in the 
art. In this regard, the invention also includes a vector comprising the nucleic acid of the 
invention, and a host cell comprising the nucleic acid of the invention. 

[0019] The invention also provides a polypeptide or protein comprising an amino acid 
sequence that is at least 70% identical to SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID 
NO:6, or SEQ ID NO:7. Preferably the sequence is at least 75% identical, more preferably 
80% identical, and even more preferably 85%, 90%, 95%, 98%, 99% or 100% identical to one of 
SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. In an alternate 
embodiment, the amino acid sequence is at least 60%, preferably 70%, 75%, 80%, and more 
preferably 85%, 90%, 95%, 98%, 99% or 100% similar to SEQ ID NO:2, SEQ ID NO:4, SEQ 
ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. 



[0020] The invention also provides an isolated polypeptide or protein comprising a sequence 
that is at least 70% identical to a fragment of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ 
ID NO:6, or SEQ ID NO:7, the fragment representing at least 10 contiguous amino acid 
residues, preferably 20 contiguous amino acid residues and most preferably 50 contiguous 
residues of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, or SEQ ID NO:7. 
Preferably the sequence is at least 75% identical, more preferably 80% identical, and even more 
preferably 85%, 90%, 95%, 98%, 99% or 100% identical to said contiguous residues. 

[0021] The polypeptide of the present invention may be produced by conventional methods of 
chemical synthesis or by recombinant DNA techniques. For example, a host microorganism may 
be transformed with a DNA fragment encoding the polypeptide and the polypeptide harvested 
from the culture. The host organism may be, for example, a bacterium, a yeast or a mammalian 
cell, whereby the DNA fragment in question is integrated in the genome of the host organism or 
inserted into a suitable expression vector capable of replicating in the host organism. The DNA 
fragment is placed under the control of regions containing suitable transcription and translation 
signals. Methods for obtaining polypeptides by these means are familiar to persons skilled in the 
art. 



[0022] The invention also provides a non-human mutant vertebrate, or "knock-out (KO) 
animal" in which the Hepp gene has been impaired at one or both loci in somatic and germ cells. 
A "knock-out animal" is an animal in which selected genes have been mutated to prevent 
expression of functional protein products. In this regard, the invention provides a non-human 
mutant vertebrate, in which all or some of the germ and somatic cells contain a mutation in at 
least one Hepp locus, which mutation is introduced into the vertebrate, or an ancestor of the 
vertebrate, at an embryonic stage. The term "vertebrate" encompasses mammals, birds, reptiles, 
amphibians, and fishes that possess a Hepp gene or equivalent. Preferably the vertebrate is a 
non-human mammal, most preferably a mouse, rat, or rabbit. 

[0023] In one preferred embodiment, the mutation produces a phenotype in a mammal 
characterized by perturbed hematopoiesis consisting of bone marrow cytopenia, overproduction 
and/or accumulation of hematopoietic progenitors, and splenomegaly with follicular hyperplasia. 
In an especially preferred embodiment, the vertebrate is a mouse that is heterozygous or 
homozygous for HEPP⁻, a knock-out gene that results in a reduction or absence of functional 
HEPP protein. Such mice can be obtained by treating mouse embryos with ES cell clone 
KST303. Other means of producing mutant animals, such as knock-in techniques, are familiar to 
those of skill in the art. 

[0024] The invention also provides a means of producing a KO mammal, in particular a 
mouse, that is heterozygous or homozygous for a defective Hepp gene (e.g. Hepp⁻). In one 
method of producing the transgenic animals, transformed ES cells containing a disrupted Hepp 
gene having undergone homologous recombination are introduced into a normal blastocyst. The 
blastocyst is then transferred into the uterus of a pseudo-pregnant female for gestation and 
delivery. Resulting heterozygous mutant animals are then bred to obtain homozygous mutant 
animals. Other means of producing KO animals are familiar to those of skill in the art. Examples 
are disclosed in U.S. Pat. No. 6,015,676 (Lin et al.) and Gene Knockout Protocols. In: Methods 
in Molecular Biology, vol. 158, 2001. Edited by: M.J. Tymms and I. Kola. Humana Press, 
Totowa, New Jersey, incorporated herein by reference. 
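
The breeding step described above (heterozygote intercross to obtain homozygotes) can be illustrated with a small Python sketch of expected genotype fractions. It assumes simple Mendelian segregation at a single locus with equal viability of all genotypes, which is an assumption of this example and not a result stated here; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Expected genotype fractions among offspring of two diploid parents.

    Assumption: Mendelian segregation and equal viability of all genotypes.
    Each parent is a pair of alleles, e.g. ("+", "-") for a heterozygote.
    """
    counts = Counter("/".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    heterozygote = ("+", "-")
    for genotype, fraction in cross(heterozygote, heterozygote).items():
        print(f"Hepp{genotype}: {fraction}")
```

Observed litter genotypes that depart strongly from these expected fractions would suggest reduced viability of one genotype, which is why genotyping of progeny is informative.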

[0025] The mutant vertebrate of the invention may be one in which all of the germ and somatic 
cells contain the mutation, i.e., the vertebrate is either a heterozygote or a homozygote for the 
mutation. The vertebrate may be one wherein both of the Hepp alleles in all of the germ and 
somatic cells contain the mutation, i.e., the vertebrate is a homozygote for the mutation. 
Alternatively, the vertebrate may be a chimera (an animal in which only some of the germ and 
somatic cells contain the mutation). 

[0026] The mutant vertebrate of the invention should be useful, inter alia, in screening drugs 
for the treatment of neurodegenerative disorders such as amyotrophic lateral sclerosis (ALS) and 
testing of novel hematopoietic cytokines/growth factors for mobilization and differentiation of 
stem and progenitor cells. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0027] Fig. 1A. Complementary DNA sequence (SEQ ID NO:1) and deduced amino acid 
sequence (SEQ ID NO:2) for the mouse Hepp gene. The 5' in-frame stop codon found upstream 
of the start codon is underlined in the nucleotide sequence. The stop codon is indicated by an 
asterisk. The polyadenylation signal-like sequence is underlined in bold. The nucleotide 
sequence data reported here appear in the GenBank nucleotide sequence databases under 
Accession No. AF322238. 

[0028] Fig. 1B. Complementary DNA sequence (SEQ ID NO:3) and deduced amino acid 
sequence (SEQ ID NO:4) for the human HEPP gene. The 5' in-frame stop codon found upstream 
of the start codon is underlined in the nucleotide sequence. The stop codon is indicated by an 
asterisk. The polyadenylation signal-like sequence is underlined in bold. The nucleotide 
sequence data reported here appear in the GenBank nucleotide sequence databases under 
Accession No. AF322239. 

[0029] Fig. 2. ClustalW amino acid sequence alignment of the mouse (SEQ ID NO: 2) and 
human (SEQ ID NO:4) HEPP proteins. 

[0030] Fig. 3. Amino acid sequence alignment of the N terminal portion of zebrafish (SEQ ID 
NO:5), mouse (SEQ ID NO:6) and human (SEQ ID NO:7) HEPP proteins. 

[0031] Fig. 4. Northern analysis of Hepp expression in adult mouse tissues. (a) Hepp is 
transcribed at a very low level in heart, lung, spleen, and thymus and at a higher level in 
muscle. (b) Hybridization with actin probe as a positive control. 

[0032] Fig. 5. Semiquantitative duplex PCR and RT-PCR expression analysis of Hepp and 
HPRT (control) in mouse fetal liver and adult bone marrow HSC and progenitor cell 
populations. 

[0033] Fig. 6. Semiquantitative duplex RT-PCR expression analysis of Hepp and HPRT in 
various hematopoietic cell lines demonstrates that mouse Hepp is ubiquitously expressed in 
different stages of lymphoid and myeloid cell development. 

[0034] Fig. 7. Chromosomal localization of Hepp and hematological malignancies associated 
with rearrangements of the band q32 on chromosome 14. 

[0035] Fig. 8. Hepp is ubiquitously expressed in neural stem cells and progenitors and 
differentiated neural cell types. 

[0036] Fig. 9. Expression pattern of mouse Hepp in central and peripheral nervous system. 

[0037] Fig. 10. Genotyping of the progeny from the breeding of heterozygous Hepp+/- mice. 

[0038] Fig. 11. Significantly reduced number of BM cells in femurs and tibias of Hepp+/- 
mice. 

[0039] Fig. 12. Decreased content of B cells, granulocytes, macrophages and 
erythroblasts in the BM of Hepp+/- mice (flow cytometry analysis). 

[0040] Fig. 13. Increased content of BM cell populations containing progenitors and HSC in 
Hepp+/- mice as analyzed by flow cytometry. 

[0041] Fig. 14. Increased content of myelo-erythroid and lymphoid progenitors in the 
BM of Hepp+/- mice (colony-forming assays). 

[0042] Fig. 15A-B. Splenomegaly in Hepp+/- mice. 
Spleens of Hepp+/+ (15A) and Hepp+/- (15B) mice. 

[0043] Fig. 15C. Significantly increased number of splenocytes in Hepp+/- mice. 

[0044] Fig. 16. Increased content of B220+ cells in the spleen of Hepp+/- mice (flow cytometry 
analysis). 

[0045] Fig. 17. Decreased content of CD8+ T cells in the spleen of Hepp+/- mice (flow 
cytometry analysis). 

[0046] Fig. 18. Progressive neurodegenerative disease in affected Hepp+/- mice. 

DETAILED DESCRIPTION OF THE INVENTION 
MATERIALS AND METHODS 

Fluorescence-Activated Sorting of Mouse Hematopoietic Stem and Progenitor Cells. 
[0047] Phenotypically and functionally defined populations of primitive Lin⁻Sca-1⁺ cells 
(comprising 0.1-0.2% of bone marrow cells and containing virtually all HSC and primitive 
progenitors) and more differentiated Lin⁻Sca-1⁻ cells (containing committed progenitors but 
lacking HSC) (20-23) were isolated from the bone marrow of 6- to 8-week-old C57BL/6J mice 
(Taconic, Germantown, NY). Cell sorting was conducted as described previously (21), using the 
FACStar Vantage flow cytometer (Becton-Dickinson Immunocytometry Systems, San Jose, 
CA). 

Library Construction and Subtractive Hybridization 

[0048] Poly(A)⁺ RNA (0.5 µg) was isolated from sorted Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ cells using the 
Micro-FastTrack mRNA isolation procedure (Invitrogen). Full-length Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ 
cell-specific cDNA libraries were constructed in λZAPII vector using the CapFinder cDNA library 
construction method, according to the manufacturer's protocol (Clontech, Palo Alto, CA, USA). 
Lin⁻Sca-1⁺ (titer 4.8 × 10^10 pfu/ml) and Lin⁻Sca-1⁻ (titer 5.6 × 10^10 pfu/ml) cell-specific libraries 
were arrayed (2 × 10^6 clones) into a 96-well format for efficient PCR-based screening (24). 
Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ cell-specific subtracted cDNA libraries were constructed by suppression 
subtractive hybridization (25, 26) using a PCR-Select kit (Clontech). Double-stranded cDNAs 
were synthesized from mouse Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ bone marrow cell poly(A)⁺ RNA, 
digested with RsaI, and used as both tester and driver in reciprocal subtractive hybridization. 
After two rounds of hybridization, portions of the reactions were subjected to two rounds of PCR to 
selectively amplify differentially expressed cDNAs, which were cloned into pGEM-T vector 
(Promega, Madison, WI). Individual clones from subtracted cDNA libraries were arrayed as dot 
blots in a 96-well format and hybridized with labeled probes derived from tester and driver 
cDNAs (19). Confirmed differentially expressed cDNA clones were sequenced and analyzed 
using computer-assisted search of GenBank/EMBL and UniGene databases 
(www.ncbi.nlm.nih.gov/UniGene/index.html). 

Cloning and Sequence Analysis of Hepp cDNA 

[0049] Mouse cDNA for Hepp was isolated by PCR-based screening of the arrayed full-length 
Lin⁻Sca-1⁻ cell-specific cDNA library (24). The longest isolated clones were sequenced and 
the derived Hepp cDNA was analyzed using the nonredundant and EST divisions of the GenBank 
database, UniGene database, and SMART (Simple Modular Architecture Research Tool, 
http://smart.embl-heidelberg.de/). Proteome WormPD database 
(http://www.proteome.com/databases) and DRES (Drosophila Related Expressed Sequences) 
Search Engine (http://hercules.tigem.it/DRES/dres.html) (27) were used to identify 
Caenorhabditis elegans and Drosophila orthologs of Hepp. 

Expression Analysis 

[0050] A mouse multiple tissue Northern blot was purchased from OriGene Technologies Inc. 
(Rockville, MD). In vitro transcribed partial Hepp cDNA was labeled with the North2South HRP 
Direct labeling kit (PIERCE, Rockford, IL) and used as a nonradioactive probe. The blot was 
prehybridized (30 min) and hybridized (1 h) at 55°C, washed according to the manufacturer's 
instructions, and exposed to X-ray film (Kodak) using Du Pont intensifying screens. 
Hybridization with a non-radioactively labeled actin probe was used as a positive control. 

[0051] Expression in mouse fetal and adult HSC and progenitor cell populations was analyzed 
by (a) semiquantitative PCR screening of cDNA libraries from fetal liver HSC 
(Sca-1⁺c-kit⁺AA4.1⁺Lin⁻ cells), fetal liver progenitors and mature blood cells (AA4.1⁻), and adult bone 
marrow HSC (Rh-123(low) Sca-1⁺c-kit⁺Lin⁻ cells), and (b) semiquantitative reverse transcription 
PCR (RT-PCR) using first strand cDNAs prepared from sorted Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ bone 
marrow cells according to the manufacturer's protocol (Clontech). Sca-1⁺c-kit⁺AA4.1⁺Lin⁻, 
AA4.1⁻, and Rh-123(low) Sca-1⁺c-kit⁺Lin⁻ cell-specific cDNA libraries (prepared by Clontech's 
CapFinder cDNA library construction method) were a kind gift from Dr. Ihor Lemischka 
(Princeton University). 

[0052] Both PCR and RT-PCRs were performed in duplex using different dilutions of cDNA 
libraries and first strand cDNAs, with mouse Hepp primers amplifying a 446-bp fragment (5' 
oligo 5'-CGAAGGAGTGGCGGGGTCTG-3' [SEQ ID NO: 8]; 3' oligo 
5'-TTCCTTTGCCCTCGTGCTGA-3' [SEQ ID NO: 9]), and primers for hypoxanthine-guanine 
phosphoribosyltransferase (HPRT) (5' oligo 5'-GTTGAGAGATCATCTCCACC-3' [SEQ ID 
NO: 10]; 3' oligo 5'-AGCGATGATGAACCAGGTTA-3' [SEQ ID NO: 11]) which amplify a 
340-bp fragment as an internal positive control. Reactions were performed in an Eppendorf 
Mastercycler for 25-40 cycles (95°C for 30 s, 57°C for 45 s, 72°C for 30 s). Hepp expression in 
various hematopoietic lineages was also assessed by semiquantitative duplex RT-PCR. A panel 
of the following lineage-specific mouse hematopoietic cell lines was used: LyD9 (pluripotent 
progenitor cell line), FDC-P1 (myeloid progenitor cell line), 1881 (pro-B cell line), BaF/3 and 
70Z/3 (pre-B cell lines), CH33 and M12 (B cell lines), NFS-70 (pro-B cell lymphoma), NFS-5 
(pre-B cell lymphoma), A20 and WEHI-279 (B cell lymphoma lines), J558 (B cell myeloma), 
EL4 and WEHI 7.1 (T cell lymphoma), WEHI-3B (myelomonocytic cell line), and RAW 309 
and J774A.1 (monocyte-macrophage cell lines). These cell lines can be obtained from the 
American Type Culture Collection (ATCC), Manassas, VA 20108. Total RNA (2 µg) from each 
cell line was reverse transcribed using random hexamers (Pharmacia, Piscataway, NJ) and 
MMLV reverse transcriptase (GIBCO) in a 20-µl reaction, and 2 µl of the first-strand cDNA was 
used as a template in a duplex PCR (30 cycles; 95°C for 30 s, 57°C for 45 s, 72°C for 30 s) with 
primers for Hepp and HPRT. 
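
As a quick sanity check on the four primers listed above, the following Python sketch computes length, GC fraction, and a rough melting-temperature estimate. The Wallace rule used here (2°C per A/T, 4°C per G/C) is a coarse approximation for short oligonucleotides and is an assumption of this example; it is not the method used to choose the 57°C annealing step. The dictionary keys and function names are hypothetical.

```python
# Primer sequences are copied from the text above (written 5'->3').
# Dictionary keys and function names are hypothetical labels for this sketch.
PRIMERS = {
    "Hepp_fwd": "CGAAGGAGTGGCGGGGTCTG",   # SEQ ID NO: 8
    "Hepp_rev": "TTCCTTTGCCCTCGTGCTGA",   # SEQ ID NO: 9
    "HPRT_fwd": "GTTGAGAGATCATCTCCACC",   # SEQ ID NO: 10
    "HPRT_rev": "AGCGATGATGAACCAGGTTA",   # SEQ ID NO: 11
}


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    s = primer.upper()
    return (s.count("G") + s.count("C")) / len(s)


def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in degrees C: 2*(A+T) + 4*(G+C). Approximation only."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), wallace_tm(seq))
```

Nearest-neighbor thermodynamic models give more reliable estimates than the Wallace rule, particularly for GC-rich primers, so the values printed here should be read only as a first-pass comparison between primers.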

RESULTS 

Isolation and Analysis of Full-Length cDNA for Hepp 

[0053] After differential screening of subtracted Lin⁻Sca-1⁺ and Lin⁻Sca-1⁻ cell-specific libraries, 
differentially expressed cDNA clones were subjected to automated sequencing and 
computer-assisted analysis. BLAST search of the GenBank/EMBL database identified one of the ESTs 
(LS215), isolated from the Lin⁻Sca-1⁻ cell-specific subtracted cDNA library, as a novel gene. Based 
on the preferential expression in adult bone marrow progenitors the gene was designated Hepp 
for hematopoietic progenitor protein. A mouse cDNA clone for Hepp was isolated by PCR-based 
screening of the arrayed full-length Lin⁻Sca-1⁻ cell-specific cDNA library using sequence-specific 
primers. The two longest isolated clones were sequenced and analyzed. The mouse Hepp transcript 
(2082 bp) contains an open reading frame (ORF) of 711 bp with one in-frame stop codon 
upstream of the first ATG codon, and encodes a protein of 237 amino acids (theoretical Mr 26.1 
kDa) with no known domains or motifs (Accession No. AF322238) (Fig. 1A). In the UniGene 
database mouse Hepp cDNA is represented by one cluster of uncharacterized ESTs (Mm.28595). 
Search of the human EST division of the GenBank database with the mouse Hepp cDNA 
sequence identified several homologous ESTs that are identical to human FLJ20764 cDNA 
(Accession No. AK000771) of unknown function. FLJ20764 cDNA (1918 bp) contains a partial 
ORF (609 bp) that encodes a 202 amino acid protein similar to mouse Hepp protein and is 
represented by one cluster of uncharacterized ESTs (Hs.34045) in the UniGene database. 
According to the NCBI HomoloGene (www.ncbi.nlm.nih.gov/HomoloGene/, a homology 
resource which includes both curated and calculated orthologs and homologs for human, mouse, 
rat, zebrafish, cow and fly genes represented in the UniGene), mouse ESTs from UniGene 
cluster Mm.28595 and human hypothetical protein FLJ20764, represented by the UniGene 
cluster Hs.34045, are calculated orthologs with 88% sequence identity. All human ESTs from 
cluster Hs.34045 were assembled into a single contig with EST Assembly Machine 
(http://gcg.tigem.it/cgi-bin/uniestass.pl), conceptually translated in all six frames 
(http://dot.imgen.bcm.tmc.edu:9331/seq-util/seq-util.html) and compared with the nucleotide and amino 
acid sequences of mouse Hepp and human FLJ20764 cDNA. The electronically extended cDNA 
(2082 bp) for human FLJ20764 contains an ORF of 723 bp with one in-frame stop codon 
upstream of the first ATG codon and encodes a 241 amino acid protein (theoretical Mr 26.1 kDa) 
(Accession No. AF322239) (Fig. 1B). ClustalW amino acid sequence alignment (32) has shown 
that mouse Hepp and human FLJ20764 proteins share 73% identity and 77% similarity, with 
several highly conserved contiguous blocks of amino acids (Fig. 2), again indicating that the 
FLJ20764 gene most likely represents the human ortholog of the mouse Hepp gene. Based on 
SMART analysis (Simple Modular Architecture Research Tool, http://smart.embl-heidelberg.de/) 
(28), SwissProt database search, and search of the Conserved Domain Database using 
RPS-BLAST (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), both mouse and human Hepp 
proteins lack any known domains or motifs, and do not have any obvious homology or 
structural similarities to known proteins. SignalP V1.1 Server 
(http://www.cbs.dtu.dk/services/SignalP/) did not predict the presence of an N-terminal signal 
peptide or signal peptide cleavage sites in mouse and human Hepp protein. NetOGlyc 2.0 Server 
(http://www.cbs.dtu.dk/services/NetOGlyc/) has predicted one putative mucin type 
O-glycosylation site in mouse Hepp protein (Thr 213) and three putative O-glycosylation sites in 
human HEPP protein (Thr 81, 122, 217). NetPhos 2.0 protein phosphorylation prediction server 
(http://www.cbs.dtu.dk/services/NetPhos/), which predicts serine, threonine, and tyrosine 
phosphorylation sites in eukaryotic proteins, has found 14 putative phosphorylation sites both in 
the mouse Hepp (Ser: 11; Thr: 2; Tyr: 1) and the human HEPP protein (Ser: 11; Thr: 2; Tyr: 1) 
(data not shown). 
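
The ORF features reported above (an ATG start, an in-frame stop codon upstream of it, and a protein length equal to the ORF length divided by three, e.g. 711 bp / 3 = 237 residues) can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical; the sketch uses the standard genetic code and simply takes the first ATG in the sequence, which is an assumption of this example rather than the analysis pipeline used here.

```python
from itertools import product

# Standard genetic code, built in TCAG order; '*' marks stop codons.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def first_orf(cdna: str):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns (start index, stop index, protein) or None if no complete ORF
    is found. Codons with non-ACGT letters are translated as 'X'.
    """
    seq = cdna.upper()
    start = seq.find("ATG")
    if start < 0:
        return None
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            return start, i, "".join(protein)
        protein.append(aa)
    return None


def has_upstream_inframe_stop(cdna: str, start: int) -> bool:
    """True if a stop codon lies upstream of `start` in the same reading frame."""
    seq = cdna.upper()
    return any(seq[i:i + 3] in STOPS for i in range(start - 3, -1, -3))


if __name__ == "__main__":
    toy = "CCTAAGCCATGGCTTGGTAAGG"  # hypothetical toy sequence, not Hepp
    start, stop, protein = first_orf(toy)
    print(start, stop, protein, (stop - start) // 3 == len(protein))
    print(has_upstream_inframe_stop(toy, start))
```

An upstream in-frame stop codon is useful evidence that the chosen ATG is the true start, because no longer N-terminally extended protein can be read in that frame.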

Identification and Analysis of Invertebrate and Vertebrate Orthologs of Hepp 

[0054] Using the Proteome WormPD database (http://www.proteome.com/databases), the DRES
(Drosophila related expressed sequences) Search Engine (27)
(http://hercules.tigem.it/DRES/dres.html) and the Drosophila Genome Project Blast Search
(http://www.fruitfly.org/cgi-bin/blast/public_blaster.pl), we were not able to identify a C. elegans
or Drosophila ortholog of the Hepp gene. By screening the GenBank nonredundant database with
mouse cDNA we have identified several rat ESTs (UniGene cluster Rn.16249) and one zebrafish EST
(Accession No. AW422282) similar to the Hepp gene. All rat ESTs in cluster Rn.16249 represent
the 3' untranslated region (3' UTR) of rat Hepp cDNA and thus could not be conceptually
translated and compared with mouse and human HEPP proteins. However, at the nucleotide
sequence level the 3' UTR of rat Hepp cDNA shares 88 and 86% identity with mouse and human
HEPP cDNAs, respectively (data not shown). The zebrafish EST, representing partial cDNA, was
conceptually translated, analyzed with SMART, and compared with the protein sequences of mouse
and human HEPP. ClustalW amino acid sequence alignment has shown that the partial zebrafish
Hepp protein shares 64% identity and 74% similarity with mouse Hepp protein, and 66% identity
and 76% similarity with human HEPP protein (Fig. 3). The alignment of the N-terminal part of the
zebrafish, mouse and human HEPP proteins demonstrates a high degree of evolutionary
conservation of the amino-terminal part of the protein and again shows several highly conserved
contiguous blocks of amino acids (Fig. 3).
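
For readers who wish to see how identity and similarity percentages of this kind can be derived from a pairwise alignment, the following is a minimal sketch only. The two aligned fragments are hypothetical and are not Hepp sequences; the denominator (aligned, non-gap columns) and the use of the Clustal "strong" conservation groups to define similarity are assumptions of this sketch, and ClustalW or other tools may report figures computed under different conventions.

```python
# Clustal "strong" conservation groups, used here as an assumed definition of similarity.
STRONG_GROUPS = ("STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW")


def is_similar(a: str, b: str) -> bool:
    """True if two residues are identical or share a strong conservation group."""
    return a == b or any(a in group and b in group for group in STRONG_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Percent identity and similarity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are excluded from the denominator in this sketch
        compared += 1
        identical += a == b
        similar += is_similar(a, b)
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments, for illustration only.
fragment_1 = "MKT-LLVA"
fragment_2 = "MKSALLIA"
identity, similarity = identity_and_similarity(fragment_1, fragment_2)
print(f"identity = {identity:.1f}%, similarity = {similarity:.1f}%")
```

Because the choice of denominator changes the result, values computed this way should be compared only with figures obtained under the same convention.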

Expression Analysis of Hepp 

[0055] Hybridization of a mouse multiple tissue Northern blot has revealed that Hepp is expressed
at a very low level in the heart, lung, spleen and thymus, and at a higher level in the muscle. The
heart and muscle express a larger ~4.8-kb transcript, whereas lung, spleen, and thymus express a
smaller ~4-kb isoform, which probably arises through alternative splicing. Hepp transcripts are
not detectable in the brain, kidney, liver, skin, intestine, stomach, and testis (Fig. 4). Since Hepp
was found to be expressed preferentially in a progenitor cell population after the differential
screening of subtracted Lin-Sca-1+ and Lin-Sca-1- cell-specific libraries, it was important to
reanalyze its expression in populations of mouse fetal and adult HSC and progenitors. Repetitive
semi-quantitative duplex PCR analysis (using various dilutions of cDNA libraries as the template
and 25-40 PCR cycles) has shown that Hepp is not expressed in mouse fetal liver HSC
(Sca-1+c-kit+AA4.1+Lin- cells), but is highly expressed in the progenitor cell population
(AA4.1- cells) (Fig. 5). Similarly, using semi-quantitative duplex PCR with various dilutions of
cDNA library and 25-40 PCR cycles, Hepp transcript was not detectable in a highly purified
population of Rh-123(low)Sca-1+c-kit+Lin- bone marrow cells. This population represents
~0.001% of normal bone marrow cells and is highly enriched for HSC activity (17, 29).
Interestingly, expression of Hepp was found to be upregulated as the more heterogeneous
population of HSC and progenitors (Lin-Sca-1+ cells, representing 0.1-0.2% of normal bone
marrow cells) differentiates into progenitors (Lin-Sca-1- cells), as analyzed by semi-quantitative
duplex RT-PCR (Fig. 5). These findings confirm the results of differential screening of
Lin-Sca-1+ and Lin-Sca-1- cell-specific subtracted libraries. RT-PCR analysis of various
hematopoietic cell lines has shown that Hepp is ubiquitously expressed in lymphoid progenitor,
pro-B, pre-B and B cell lines including lymphomas, in T cell lymphoma cell lines and thymus,
and in myeloid progenitor and monocyte-macrophage cell lines (Fig. 6).

Human HEPP maps to chromosomal region with frequent chromosome aberrations associated
with multiple cases of various hematological malignancies

[0056] Using the CELERA Gene Discovery System and BAC mapping it was determined that the
mouse Hepp gene maps to the telomeric part of chromosome 12, whereas the human HEPP gene
maps to the q32 region of human chromosome 14, depicted in Figure 7. According to the
Breakpoint Map of Recurrent Chromosome Aberrations database
(http://www.ncbi.nlm.nih.gov/CCAP/mitelsum.cgi), band 14q32 represents a region with
frequent balanced (translocations) and unbalanced chromosome aberrations (deletions,
duplications) associated with multiple cases of various hematological malignancies (Table 1), for
some of which the genes involved are unknown.

Table 1

Neoplasm                                   Cases

Acute myelomonocytic leukemia              5
Lymphoblastic lymphoma                     13
Chronic lymphocytic leukemia               185
Acute lymphoblastic leukemia               316
Multiple myeloma                           190
B-prolymphocytic leukemia                  24
Plasma cell leukemia                       35
Adult T-cell lymphoma/leukemia             31
Diffuse large B-cell lymphoma              324
Nodal marginal zone B-cell lymphoma        2
Burkitt's lymphoma                         127
Follicular lymphoma                        515
Hairy cell leukemia                        8
Mantle cell lymphoma                       158
Splenic marginal zone B-cell lymphoma      21
T-prolymphocytic leukemia                  51



Mapping of human HEPP to a chromosomal region with frequent chromosome aberrations
associated with multiple cases of various hematological malignancies suggests that HEPP is
involved in the etiology of some hematological malignancies.

Mouse Hepp is expressed in almost all parts of central and peripheral nervous system
throughout embryonic development and in adult mice, and is expressed in neural stem cells and
progenitors

[0057] Expression of Hepp was analyzed in mouse fetal neural stem cells (NSC) and
progenitors. Cultures of NSC and progenitors from the E14 embryonic brain subventricular zone
(SV) were established in the presence of bFGF (basic fibroblast growth factor, or FGF-2,
necessary for NSC maintenance) and were induced to differentiate into neurons and glial cells by
withdrawal of bFGF or by addition of ciliary neurotrophic factor (CNTF), platelet-derived
growth factor (PDGF-β) or bone morphogenetic protein 7 (BMP7), cytokines used to drive
differentiation of NSC into neurons and glial cells in culture. Cyclophilin A was used as an
internal positive control in RT-PCR. These experiments have established that Hepp is
ubiquitously expressed in fetal NSC, progenitors and differentiated neural cell types (Fig. 8).

[0058] Using the Mouse Brain Rapid-Scan Panel (OriGene Technologies), the expression pattern
of Hepp in the embryonic and adult central and peripheral nervous system was analyzed. The
results demonstrated that Hepp is expressed in almost all parts of the central and peripheral
nervous system (including forebrain, midbrain, hindbrain, and spinal cord) throughout embryonic
development and in adult mice (Fig. 9).

Generation of Hepp knockout mice 

[0059] Searching the database of trapped genes (Dr. William Skarnes, UC Berkeley)
(http://socrates.berkeley.edu/~skarnes/resource.html), we identified ES clone KST303, in which
an allele of HEPP was trapped by the ATG-less secretory gene trap vector pGT1.8TM βgeo. The
gene trap vector pGT1.8TM βgeo contains a splice acceptor sequence and the transmembrane
domain (TM) of the CD4 gene upstream of a reporter and is activated following insertion into an
intron. The analysis of the trapping event in ES cell clone KST303 showed proper splicing of the
integrated vector and fusion of the βgeo reporter to the 5' UTR of the HEPP transcript, which
should result in a severely truncated transcript and absence of functional HEPP protein. Using ES
cell clone KST303 we generated HEPP knockout mice. ES clones with targeted Hepp alleles can
be generated by routine means by a practitioner skilled in gene targeting techniques. (See, for
example, Gene Knockout Protocols. In: Methods in Molecular Biology, vol. 158, 2001. Edited
by: M.J. Tymms and I. Kola. Humana Press, Totowa, New Jersey.)

[0060] Viable heterozygous Hepp+/- mice were bred to generate Hepp-/- mice (Fig. 10).
Genotyping of progeny from breeding of Hepp+/- mice has revealed that the vast majority
(80%) of Hepp-/- mice die in utero (Fig. 10; Table 2).

Table 2. Genotyping and ratio of adult Hepp+/+, Hepp+/- and Hepp-/- mice.

Genotype                        Hepp+/+     Hepp+/-     Hepp-/-     Total

Number of mice                  15          34          3           52
Experimental Mendelian ratio    23.5%       53%         4.7%
Theoretical Mendelian ratio     16 (25%)    32 (50%)    16 (25%)    64
                                1           2           1
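
The arithmetic behind Table 2 can be checked with a short sketch. It assumes, as an inference from the listed values rather than a statement in the specification, that the experimental percentages are expressed relative to the theoretical total of 64 animals. The chi-square goodness-of-fit test against a 1:2:1 ratio is added here purely for illustration and is not part of the analysis described above.

```python
import math

# Adult genotype counts taken from Table 2.
observed = {"+/+": 15, "+/-": 34, "-/-": 3}
# Theoretical 1:2:1 distribution of 64 animals, as listed in Table 2.
theoretical = {"+/+": 16, "+/-": 32, "-/-": 16}
theoretical_total = sum(theoretical.values())

# Assumption: experimental percentages are relative to the theoretical total.
experimental_pct = {g: 100.0 * n / theoretical_total for g, n in observed.items()}

# Fraction of the theoretically expected Hepp-/- animals that is missing among adults.
missing_fraction = 1.0 - observed["-/-"] / theoretical["-/-"]

# Illustrative chi-square test of the genotyped animals against a 1:2:1 ratio.
genotyped = sum(observed.values())
weights = {"+/+": 1, "+/-": 2, "-/-": 1}
expected = {g: genotyped * w / sum(weights.values()) for g, w in weights.items()}
chi2 = sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)
# With 2 degrees of freedom the chi-square survival function is exactly exp(-x/2).
p_value = math.exp(-chi2 / 2.0)

for genotype, pct in experimental_pct.items():
    print(f"Hepp{genotype}: {pct:.1f}% of theoretical total")
print(f"missing Hepp-/- fraction: {missing_fraction:.2%}")
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Under these assumptions roughly 81% of the expected Hepp-/- animals are absent among adults, in line with the figure of about 80% given above, and the departure from a 1:2:1 ratio among the 52 genotyped mice corresponds to a chi-square of about 10.5 (p of about 0.005).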



Analysis of hematopoietic system in Hepp KO mice 

[0061] The analysis of 23 Hepp+/- mice revealed perturbed hematopoiesis consisting of bone
marrow cytopenia, overproduction and/or accumulation of hematopoietic progenitors, and
splenomegaly with follicular hyperplasia. In addition, Hepp+/- mice have a significantly reduced
number of bone marrow (BM) cells in femurs and tibias (Fig. 11).

[0062] Flow cytometry analysis revealed decreased content of B cells, granulocytes,
macrophages and erythroblasts in the BM of Hepp+/- mice (Fig. 12). In contrast, the content of
BM cell populations containing (a) immature hematopoietic cells (lineage-negative, Lin- cells),
(b) early and late progenitors (Lin-Sca-1- and Lin-c-kit- cells), and (c) early progenitors and HSC
(Lin-Sca-1+ and Lin-c-kit+ cells) was increased in Hepp+/- mice (Fig. 13). Furthermore,
colony-forming assays demonstrated increased content of blast colony-forming (CFU-Blast),
myelo-erythroid (CFU-GM, BFU-E and CFU-Meg) and lymphoid (CFU-B) progenitors in the BM
from Hepp+/- mice (Fig. 14).

[0063] Another readily observable feature was very frequent splenomegaly in Hepp+/- mice,
with a significantly increased number of splenocytes and follicular hyperplasia (Fig. 15). This
follicular hyperplasia was accompanied by increased content of B220+ cells in the spleen of
Hepp+/- mice as analyzed by flow cytometry (Fig. 16). Flow cytometry analysis of myeloid
cells (granulocytes, macrophages and erythroblasts) in the spleen did not show any difference
between wild type and Hepp+/- mice (data not shown). We have also observed a slight decrease
in the content of CD4+ T cells and significantly decreased content of CD8+ T cells in the spleen
of Hepp+/- mice as analyzed by flow cytometry (Fig. 17).

Analysis of central and peripheral nervous system in Hepp KO mice

[0064] The last facet of the phenotype is progressive neuromuscular degeneration in Hepp+/-
mice. About 40% of Hepp+/- mice show slight tremor, impaired balance during walking, and
very mild paralysis of the hind legs. Affected mice have difficulty turning over when placed on
their backs.

[0065] After 4 months of age about 10% of affected Hepp+/- mice exhibit full paralysis of the
hind legs, seizures, severe muscular atrophy and wasting (Fig. 18). Mice with full penetrance of
the progressive neurodegenerative disease do not survive beyond 6 months of age. A review of
current literature and mouse models (e.g., mice lacking the hypoxia-response element of VEGF;
Oosthuyse B, Moons L, Storkebaum E, et al. Deletion of the hypoxia-response element
in the vascular endothelial growth factor promoter causes motor neuron degeneration. Nat.
Genet. 2001 Jun;28(2):131-138) supports the conclusion that the adult-onset progressive
neurodegenerative disease in Hepp+/- mice has features that closely resemble amyotrophic
lateral sclerosis (ALS). Accordingly, these mice show promise of being a useful model for the
study of this human disease.

CONCLUSIONS 

[0066] In summary, the multifaceted phenotype of Hepp+/- mice consists of at least the
following features:

[0067] 1. Skeletal defects and growth retardation, indicating that Hepp plays a role in
embryonic development,

[0068] 2. Perturbed hematopoiesis that encompasses: bone marrow cytopenia, overproduction 
and accumulation of hematopoietic progenitors, and splenomegaly with follicular hyperplasia, 
and 

[0069] 3. Adult-onset progressive neurodegenerative disease reminiscent of amyotrophic lateral 
sclerosis (ALS), which suggests a role for Hepp in neuronal development and function. 

[0070] The complex phenotype of Hepp KO mice suggests that Hepp is part of a common
molecular mechanism utilized in the development and differentiation of hematopoietic and
neuronal cells and perhaps other cell types as well.



DISCUSSION 

[0071] Differential screening of subtracted cDNA libraries from mouse fetal and adult cell
populations enriched for HSC and progenitors and sequencing of differentially expressed clones
have already yielded a number of both novel and evolutionarily conserved genes, present
from Drosophila to humans (16, 17, 19, 31). Described herein is the cloning and
characterization of a novel gene, Hepp, identified through differential screening of subtracted
cDNA libraries from mouse adult bone marrow cell populations enriched for HSC (Lin-Sca-1+
cells) and progenitors (Lin-Sca-1- cells) (19). Mouse Hepp and human HEPP transcripts encode
novel conserved proteins without any known structural or functional domains or motifs, and
lacking any obvious homology or structural similarity to known proteins. Furthermore, the lack
of invertebrate orthologs and a high degree of evolutionary conservation of the peptide sequence
in zebrafish, mouse and humans suggest that in vertebrates the Hepp gene has an important,
conserved, although as yet not completely elucidated, function. Differential screening of mouse
bone marrow HSC (Lin-Sca-1+) and progenitor (Lin-Sca-1-) cell-specific subtracted cDNA
libraries has demonstrated that Hepp is expressed preferentially in progenitor cell populations
(Lin-Sca-1- cells). During embryonic blood cell development Hepp is not expressed in the
population of mouse fetal liver HSC (Sca-1+c-kit+AA4.1+Lin- cells), but is abundantly
transcribed in fetal liver progenitors and mature blood cells (AA4.1- cells). These results are in
agreement with the fact that in the BLAST search of the Stem Cell Database (SCDB;
http://stemcell.princeton.edu/; Dr. Ihor Lemischka, Princeton University) mouse Hepp cDNA did
not match any ESTs derived from the Sca-1+c-kit+AA4.1+Lin- cell-specific subtracted library,
containing transcripts expressed preferentially in the mouse fetal liver HSC population (17, 30).



[0072] Similarly, during adult mouse hematopoiesis, Hepp is not transcribed in the population of
Rho-123(low)Sca-1+c-kit+Lin- cells (representing ~0.001% of normal bone marrow cells and
highly enriched for HSC) (17, 29), but is expressed at a low level in the more heterogeneous
population of Lin-Sca-1+ cells (representing 0.1-0.2% of normal bone marrow cells and enriched
for HSC and progenitors). Hepp transcription is upregulated in the progenitor cell population
(Lin-Sca-1- cells) and in various lymphoid and myeloid cell lines. Therefore, mouse Hepp
exhibits a developmentally regulated pattern and conservation of preferential expression in both
fetal and adult hematopoietic progenitors and mature blood cells during embryonic and adult
hematopoiesis. The restricted expression pattern in tissues and preferential expression in mouse
fetal and adult hematopoietic progenitors and mature blood cells suggest that mouse Hepp is
involved in the regulation of HSC and progenitor cell lineage commitment and differentiation.

[0073] In describing preferred embodiments of the present invention, specific terminology is 
employed for the sake of clarity. However, the invention is not intended to be limited to the 
specific terminology so selected. It is to be understood that each specific element includes all 
technical equivalents, which operate in a similar manner to accomplish a similar purpose. Each 
reference cited herein is incorporated by reference as if each were individually incorporated by 
reference. 



[0074] The embodiments illustrated and discussed in the present specification are intended only 
to teach those skilled in the art the best way known to the inventors to make and use the 
invention, and should not be considered as limiting the scope of the present invention. The 
exemplified embodiments of the invention may be modified or varied, and elements added or 
omitted, without departing from the invention, as appreciated by those skilled in the art in light 
of the above teachings. It is therefore to be understood that, within the scope of the claims and 
their equivalents, the invention may be practiced otherwise than as specifically described. 